SpatialML: Annotation Scheme, Corpora, and Tools

Authors

  • Inderjeet Mani
  • Janet Hitzeman
  • Justin Richer
  • Dave Harris
  • Rob Quimby
  • Ben Wellner
Abstract

SpatialML is an annotation scheme for marking up references to places in natural language. It covers both named and nominal references to places, grounding them where possible with geo-coordinates, including both relative and absolute locations, and characterizes relationships among places in terms of a region calculus. A freely available annotation editor has been developed for SpatialML, along with a corpus of annotated documents released by the Linguistic Data Consortium. Inter-annotator agreement on SpatialML extents is 77.0 F-measure on that corpus, and 92.3 F-measure on a ProMED corpus. Disambiguation agreement on geo-coordinates is 71.85 F-measure on the latter corpus. An automatic tagger for SpatialML extents scores 78.5 F-measure. A disambiguator scores 93.0 F-measure. In adapting the extent tagger to new domains, merging the training data from the above corpus with annotated data in the new domain provides the best performance.
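The agreement figures above are F-measures over annotated extents. As a rough illustration only, the sketch below computes a balanced F-measure between two annotators' extent sets. It assumes extents are (start, end) character offsets and that two extents agree only when they match exactly; the abstract does not state the matching criterion actually used, and the function name and sample offsets are hypothetical.

```python
def f_measure(reference, hypothesis):
    """Balanced F-measure (F1) between two collections of annotated extents.

    Assumption: an extent is a (start, end) offset pair and counts as a
    match only if both offsets are identical in the two annotations.
    """
    reference, hypothesis = set(reference), set(hypothesis)
    if not reference or not hypothesis:
        return 0.0
    matches = len(reference & hypothesis)
    precision = matches / len(hypothesis)   # share of hypothesis extents that match
    recall = matches / len(reference)       # share of reference extents that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical extents marked by two annotators on the same document.
annotator_a = {(0, 6), (21, 27), (40, 52)}
annotator_b = {(0, 6), (21, 27), (60, 66), (70, 75)}

score = f_measure(annotator_a, annotator_b)
print(f"{100 * score:.1f}")  # reported on a 0-100 scale, as in the abstract
```

With exact matching the measure is symmetric, so it does not matter which annotator is treated as the reference when it is used as an agreement score.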

Similar resources

SpatialML: annotation scheme, resources, and evaluation

SpatialML is an annotation scheme for marking up references to places in natural language. It covers both named and nominal references to places, grounding them where possible with geocoordinates, and characterizes relationships among places in terms of a region calculus. A freely available annotation editor has been developed for SpatialML, along with several annotated corpora. Inter-annotator...

eSpaceML: An Event-Driven Spatial Annotation Framework

This paper proposes eSpaceML as a representation scheme for annotating event-driven spatial expressions in natural language. It adopts SpatialML (MITRE, 2009) and ISO-Space (ISO, 2010) as a basis for the development of a novel, distributed spatial annotation scheme. SpatialML focuses on the annotation of spatial locations and their topological relations, while both ISO-Space and eSpaceML attempt...

An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies

A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...

Developing an Annotation Scheme for ELL Spelling Errors

This paper describes an XML annotation scheme for English Language Learner (ELL) spelling errors in learner corpora which can be used to create standardized, annotated ELL error corpora for use by researchers who are developing spelling correction tools for ELLs. We also provide an error taxonomy (with examples of each error type) upon which the scheme was based.

Briefly Noted English for the Computer: The SUSANNE Corpus and Analytic Scheme

Over the past 10–20 years, there has been increasing interest in grammatical / syntactic annotation schemes for corpora. Annotated corpora are essential for training and testing taggers and parsers, for describing the use of lexical and grammatical features, and for comprehensive analyses of registers or sublanguages. Several annotation schemes have been developed over this period, including bo...

Publication date: 2008